SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANT: Aldis Darzins
Gregory T. Mrachko

(ii) TITLE OF INVENTION: A Sphingomonas Biodesulfurization Catalyst

(iii) NUMBER OF SEQUENCES: 13

(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: Hamilton, Brook, Smith & Reynolds, P.C.
(B) STREET: Two Militia Drive
(C) CITY: Lexington
(D) STATE: Massachusetts
(E) COUNTRY: USA
(F) ZIP: 02173

(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.30

(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER: US 08/851,089
(B) FILING DATE: 05-MAY-1997
(C) CLASSIFICATION:

(vii) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: US 08/835,292
(B) FILING DATE: 07-APR-1997

(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: Elmore, Carolyn S.
(B) REGISTRATION NUMBER: 37,567
(C) REFERENCE/DOCKET NUMBER: EBC97-06A2

(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: (781) 861-6240
(B) TELEFAX: (781) 861-9540

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1362 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..1359

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

ATG ACC GAT CCA CGT CAG CTG CAC CTG GCC GGA TTC TTC TGT GCC GGC 48
Met Thr Asp Pro Arg Gln Leu His Leu Ala Gly Phe Phe Cys Ala Gly
1 5 10 15

AAC GTC ACG CAC GCC CAC GGA GCG TGG CGC CAC GCC GAC GAC TCC AAC 96
Asn Val Thr His Ala His Gly Ala Trp Arg His Ala Asp Asp Ser Asn
20 25 30

GGC TTC CTC ACC AAG GAG TAC TAC CAG CAG ATT GCC CGC ACG CTC GAG 144
Gly Phe Leu Thr Lys Glu Tyr Tyr Gln Gln Ile Ala Arg Thr Leu Glu
35 40 45

CGC GGC AAG TTC GAC CTG CTG TTC CTT CCC GAC GCG CTC GCC GTG TGG 192
Arg Gly Lys Phe Asp Leu Leu Phe Leu Pro Asp Ala Leu Ala Val Trp
50 55 60

GAC AGC TAC GGC GAC AAT CTG GAG ACC GGT CTG CGG TAT GGC GGG CAA 240
Asp Ser Tyr Gly Asp Asn Leu Glu Thr Gly Leu Arg Tyr Gly Gly Gln
65 70 75 80

GGC GCG GTG ATG CTG GAG CCC GGC GTA GTT ATC GCC GCG ATG GCC TCG 288
Gly Ala Val Met Leu Glu Pro Gly Val Val Ile Ala Ala Met Ala Ser
85 90 95

GTG ACC GAA CAT CTG GGG CTG GGC GCC ACC ATT TCC ACC ACC TAC TAC 336
Val Thr Glu His Leu Gly Leu Gly Ala Thr Ile Ser Thr Thr Tyr Tyr
100 105 110

CCG CCC TAC CAT GTA GCC CGG GTC GTC GCT TCG CTG GAC CAG CTG TCC 384
Pro Pro Tyr His Val Ala Arg Val Val Ala Ser Leu Asp Gln Leu Ser
115 120 125

TCC GGG CGA GTG TCG TGG AAC GTG GTC ACC TCG CTC AGC AAT GCA GAG 432
Ser Gly Arg Val Ser Trp Asn Val Val Thr Ser Leu Ser Asn Ala Glu
130 135 140

GCG CGC AAC TTC GGC TTC GAT GAA CAT CTC GAG CAC GAT GCC CGC TAC 480
Ala Arg Asn Phe Gly Phe Asp Glu His Leu Asp His Asp Ala Arg Tyr
145 150 155 160

GAT CGC GCC GAT GAA TTC CTC GAG GTC GTG CGC AAG CTC TGG AAC AGC 528
Asp Arg Ala Asp Glu Phe Leu Glu Val Val Arg Lys Leu Trp Asn Ser
165 170 175

TGG GAT CGC GAT GCG CTG ACA CTC GAC AAG GCA ACC GGC CAG TTC GCC 576
Trp Asp Arg Asp Ala Leu Thr Leu Asp Lys Ala Thr Gly Gln Phe Ala
180 185 190

GAT CCG GCT AAG GTG CGC TAC ATC GAC CAC CGC GGC GAA TGG CTC AAC 624
Asp Pro Ala Lys Val Arg Tyr Ile Asp His Arg Gly Glu Trp Leu Asn
195 200 205

GTA CGC GGG CCG CTT CAG GTG CCG CGC TCC CCC CAG GGC GAG CCT GTC 672
Val Arg Gly Pro Leu Gln Val Pro Arg Ser Pro Gln Gly Glu Pro Val
210 215 220

ATT CTG CAG GCC GGG CTT TCG GCG CGG GGC AAG CGC TTC GCC GGG CGC 720
Ile Leu Gln Ala Gly Leu Ser Ala Arg Gly Lys Arg Phe Ala Gly Arg
225 230 235 240

TGG GCG GAC GCG GTG TTC ACG ATT TCG CCC AAT CTG GAC ATC ATG CAG 768
Trp Ala Asp Ala Val Phe Thr Ile Ser Pro Asn Leu Asp Ile Met Gln
245 250 255

GCC ACG TAC CGC GAC ATA AAG GCG CAG GTC GAG GCC GCC GGA CGC GAT 816
Ala Thr Tyr Arg Asp Ile Lys Ala Gln Val Glu Ala Ala Gly Arg Asp
260 265 270

CCC GAG CAG GTC AAG GTG TTT GCC GCG GTG ATG CCG ATC CTC GGC GAG 864
Pro Glu Gln Val Lys Val Phe Ala Ala Val Met Pro Ile Leu Gly Glu
275 280 285

ACC GAG GCG ATC GCC AGG CAG CGT CTC GAA TAC ATA AAT TCG CTG GTG 912
Thr Glu Ala Ile Ala Arg Gln Arg Leu Glu Tyr Ile Asn Ser Leu Val
290 295 300

CAT CCC GAA GTC GGG CTT TCT ACG TTG TCC AGC CAT GTC GGG GTC AAC 960
His Pro Glu Val Gly Leu Ser Thr Leu Ser Ser His Val Gly Val Asn
305 310 315 320

CTT GCC GAC TAT TCG CTC GAT ACC CCG CTG ACC GAG GTC CTG GGC GAT 1008
Leu Ala Asp Tyr Ser Leu Asp Thr Pro Leu Thr Glu Val Leu Gly Asp
325 330 335

CTC GCC CAG CGC AAC GTG CCC ACC CAA CTG GGC ATG TTC GCC AGG ATG 1056
Leu Ala Gln Arg Asn Val Pro Thr Gln Leu Gly Met Phe Ala Arg Met
340 345 350

TTG CAG GCC GAG ACG CTG ACC GTG GGA GAA ATG GGC CGG CGT TAT GGC 1104
Leu Gln Ala Glu Thr Leu Thr Val Gly Glu Met Gly Arg Arg Tyr Gly
355 360 365

GCC AAC GTG GGC TTC GTC CCG CAG TGG GCG GGA ACC CGC GAG CAG ATC 1152
Ala Asn Val Gly Phe Val Pro Gln Trp Ala Gly Thr Arg Glu Gln Ile
370 375 380



GCG GAC CTG ATC GAG ATC CAT TTC AAG GCC GGC GGC GCC GAT GGC TTC 1200
Ala Asp Leu Ile Glu Ile His Phe Lys Ala Gly Gly Ala Asp Gly Phe
385 390 395 400

ATC ATC TCG CCG GCG TTC CTG CCC GGA TCT TAC GAG GAA TTC GTC GAT 1248
Ile Ile Ser Pro Ala Phe Leu Pro Gly Ser Tyr Glu Glu Phe Val Asp
405 410 415

CAG GTG GTG CCC ATC CTG CAG CAC CGC GGA CTG TTC CGC ACT GAT TAC 1296
Gln Val Val Pro Ile Leu Gln His Arg Gly Leu Phe Arg Thr Asp Tyr
420 425 430



GAA GGC CGC ACC CTG CGC AGC CAT CTG GGA CTG CGT GAA CCC GCA TAC 1344 
Glu Gly Arg Thr Leu Arg Ser His Leu Gly Leu Arg Glu Pro Ala Tyr 
435 440 445 



CTG GGA GAG TAC GCA TGA 1362 
Leu Gly Glu Tyr Ala 
450 
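
The translation shown under SEQ ID NO:1 can be spot-checked mechanically. The following is a minimal sketch, not part of the filed listing: it assumes the standard genetic code applies to this coding sequence, and the names `translate` and `cds_residue_count` are hypothetical helpers introduced only for illustration. It translates a run of codons into three-letter residue codes and checks that the CDS location 1..1359 corresponds to 453 residues, with the final triplet (1360..1362) being the stop codon.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
# (assumption: the standard code is appropriate for this sequence).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# One-letter to three-letter residue codes; "Ter" marks a stop codon.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}


def translate(dna):
    """Translate a DNA string (whitespace ignored) into three-letter codes."""
    dna = "".join(dna.split()).upper()
    usable = len(dna) - len(dna) % 3  # ignore any trailing partial codon
    codons = [dna[i:i + 3] for i in range(0, usable, 3)]
    return [THREE_LETTER[CODON_TABLE[codon]] for codon in codons]


def cds_residue_count(start, end):
    """Number of complete codons in an inclusive, 1-based CDS location."""
    return (end - start + 1) // 3


if __name__ == "__main__":
    first_row = "ATG ACC GAT CCA CGT CAG CTG CAC CTG GCC GGA TTC TTC TGT GCC GGC"
    last_row = "CTG GGA GAG TAC GCA TGA"
    print(" ".join(translate(first_row)))
    print(" ".join(translate(last_row)))
    print(cds_residue_count(1, 1359))
```

Under these assumptions, the first row reproduces residues 1 to 16 as listed, the last row ends in a stop codon after Ala, and the residue count agrees with the 453 amino acids stated for SEQ ID NO:2.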



(2) INFORMATION FOR SEQ ID NO: 2: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 453 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:



Met Thr Asp Pro Arg Gln Leu His Leu Ala Gly Phe Phe Cys Ala Gly
1 5 10 15

Asn Val Thr His Ala His Gly Ala Trp Arg His Ala Asp Asp Ser Asn
20 25 30

Gly Phe Leu Thr Lys Glu Tyr Tyr Gln Gln Ile Ala Arg Thr Leu Glu
35 40 45

Arg Gly Lys Phe Asp Leu Leu Phe Leu Pro Asp Ala Leu Ala Val Trp
50 55 60

Asp Ser Tyr Gly Asp Asn Leu Glu Thr Gly Leu Arg Tyr Gly Gly Gln
65 70 75 80

Gly Ala Val Met Leu Glu Pro Gly Val Val Ile Ala Ala Met Ala Ser
85 90 95

Val Thr Glu His Leu Gly Leu Gly Ala Thr Ile Ser Thr Thr Tyr Tyr
100 105 110

Pro Pro Tyr His Val Ala Arg Val Val Ala Ser Leu Asp Gln Leu Ser
115 120 125

Ser Gly Arg Val Ser Trp Asn Val Val Thr Ser Leu Ser Asn Ala Glu
130 135 140

Ala Arg Asn Phe Gly Phe Asp Glu His Leu Asp His Asp Ala Arg Tyr
145 150 155 160

Asp Arg Ala Asp Glu Phe Leu Glu Val Val Arg Lys Leu Trp Asn Ser
165 170 175

Trp Asp Arg Asp Ala Leu Thr Leu Asp Lys Ala Thr Gly Gln Phe Ala
180 185 190

Asp Pro Ala Lys Val Arg Tyr Ile Asp His Arg Gly Glu Trp Leu Asn
195 200 205

Val Arg Gly Pro Leu Gln Val Pro Arg Ser Pro Gln Gly Glu Pro Val
210 215 220

Ile Leu Gln Ala Gly Leu Ser Ala Arg Gly Lys Arg Phe Ala Gly Arg
225 230 235 240

Trp Ala Asp Ala Val Phe Thr Ile Ser Pro Asn Leu Asp Ile Met Gln
245 250 255

Ala Thr Tyr Arg Asp Ile Lys Ala Gln Val Glu Ala Ala Gly Arg Asp
260 265 270

Pro Glu Gln Val Lys Val Phe Ala Ala Val Met Pro Ile Leu Gly Glu
275 280 285

Thr Glu Ala Ile Ala Arg Gln Arg Leu Glu Tyr Ile Asn Ser Leu Val
290 295 300

His Pro Glu Val Gly Leu Ser Thr Leu Ser Ser His Val Gly Val Asn
305 310 315 320

Leu Ala Asp Tyr Ser Leu Asp Thr Pro Leu Thr Glu Val Leu Gly Asp
325 330 335